May 22, 2023

Challenge 12: Data Science Job Salaries


Dataset on Kaggle - See challenge questions below.
Link to GitHub repository, including data and final output.

1. About the dataset
The dataset contains a table of records for data science positions and related information, such as salary, location, year recorded, and experience level. A copy of the data dictionary is below.
2. Challenge questions
There are nine questions in total, querying different aspects of the dataset.
  1. Which role has the highest salary, employment-wise?
  2. Which employment types do employers prefer to hire?
  3. Which roles are entry-level employees generally hired for?
  4. Which countries pay the highest for which roles?
  5. What insights can you find regarding employee demographics?
  6. Which experience level has the highest hiring?
  7. Does company size affect the rate of hiring and pay scale?
  8. What is the year over year (YoY) salary growth at different levels?
  9. Create a dashboard to summarize your insights.
3. Data cleaning and transformation
  1. After importing the dataset into the Power Query Editor, double-checked that the data types, distributions, and values matched the data dictionary.
  2. Named the first column ID.
  3. Renamed job_title to Position.
  4. Replaced acronyms in Experience, Employment type and Company size columns with more descriptive values.
  5. Imported and merged an ISO 3166 country code table and replaced the country codes with the country names.
  6. Removed non-USD salary columns.
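The cleaning itself was done in the Power Query Editor, but the same steps can be sketched in Python for readers who prefer code. This is only an illustration under stated assumptions: the sample rows are made up, the source column names (work_year, experience_level, salary_in_usd, and so on) and the acronym meanings are assumed from the usual layout of the Kaggle file rather than taken from this post, and the small COUNTRIES dictionary stands in for the full ISO 3166 table.

```python
import csv
import io

# Hypothetical sample rows for illustration only; the real file has many more.
# Column names are assumptions about the raw dataset layout.
RAW_CSV = """\
,work_year,experience_level,employment_type,job_title,salary,salary_currency,salary_in_usd,company_location,company_size
0,2021,MI,FT,Data Scientist,70000,EUR,79833,DE,L
1,2022,SE,FT,Data Engineer,140000,USD,140000,US,M
2,2020,EN,PT,Data Analyst,20000,GBP,25000,GB,S
"""

# Step 4: descriptive values for the acronyms (assumed meanings).
EXPERIENCE = {"EN": "Entry-level", "MI": "Mid-level", "SE": "Senior", "EX": "Executive"}
EMPLOYMENT = {"FT": "Full-time", "PT": "Part-time", "CT": "Contract", "FL": "Freelance"}
COMPANY_SIZE = {"S": "Small", "M": "Medium", "L": "Large"}

# Step 5: a tiny stand-in for the merged ISO 3166 country code table.
COUNTRIES = {"DE": "Germany", "US": "United States", "GB": "United Kingdom"}


def clean_row(row):
    """Apply steps 2-6 to one raw record and return the cleaned record."""
    return {
        "ID": row[""],                      # step 2: name the unnamed first column
        "Year": int(row["work_year"]),
        "Experience": EXPERIENCE.get(row["experience_level"], row["experience_level"]),
        "Employment type": EMPLOYMENT.get(row["employment_type"], row["employment_type"]),
        "Position": row["job_title"],       # step 3: job_title becomes Position
        "Salary (USD)": int(row["salary_in_usd"]),  # step 6: keep only the USD salary
        "Company location": COUNTRIES.get(row["company_location"], row["company_location"]),
        "Company size": COMPANY_SIZE.get(row["company_size"], row["company_size"]),
    }


def clean(csv_text):
    """Read CSV text and return a list of cleaned records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [clean_row(row) for row in reader]


cleaned = clean(RAW_CSV)
for record in cleaned:
    print(record)
```

Unknown codes fall back to their original value, so a code missing from a lookup table stays visible instead of being silently dropped.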
4. Results
Questions 1-8 were answered using pivot tables (Summary tab), from which I created the pivot charts for question 9 (Dashboard tab).
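For readers without Excel, the pivot-table logic behind a question such as the year-over-year salary growth can be approximated in a few lines of Python. The sketch below is not the workbook's actual calculation: the records are invented, the field names are hypothetical, and it assumes growth is measured on the average salary per experience level, i.e. (this year's average - last year's average) / last year's average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cleaned records; values are invented for illustration.
records = [
    {"Year": 2021, "Experience": "Senior", "Salary (USD)": 100000},
    {"Year": 2021, "Experience": "Senior", "Salary (USD)": 120000},
    {"Year": 2022, "Experience": "Senior", "Salary (USD)": 132000},
    {"Year": 2021, "Experience": "Entry-level", "Salary (USD)": 50000},
    {"Year": 2022, "Experience": "Entry-level", "Salary (USD)": 55000},
]


def pivot_mean(rows, row_key, col_key, value_key):
    """Average value_key for each (row_key, col_key) pair, like a pivot table."""
    cells = defaultdict(list)
    for r in rows:
        cells[(r[row_key], r[col_key])].append(r[value_key])
    return {key: mean(values) for key, values in cells.items()}


def yoy_growth(table):
    """Relative change of each (level, year) cell against the previous year."""
    growth = {}
    for (level, year), avg in table.items():
        prev = table.get((level, year - 1))
        if prev:  # skip the first year and any zero average
            growth[(level, year)] = (avg - prev) / prev
    return growth


avg_salary = pivot_mean(records, "Experience", "Year", "Salary (USD)")
growth = yoy_growth(avg_salary)

for (level, year), rate in sorted(growth.items()):
    print(f"{level} {year}: {rate:.1%}")
```

Swapping the row or column key (for example, Company size instead of Experience) gives the grouping needed for the other questions.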